Method and device for acoustic echo cancellation combined with adaptive beamforming

The present invention relates to a method wherein multiple input signals are 
subjected to a combination process of adaptive beamforming and adaptive echo cancelling. 

The present invention also relates to an audio processing device comprising at 
least one parallel acoustic path for providing respective input signals, the acoustic paths being 
connected in series to beamformer paths, and comprising an adaptive beamformer and an 
adaptive echo canceller for performing adaptive echo cancelling; and to a communication 
device such as found in audio broadcast systems, audio and/or video conferencing systems, 
speech enhancement, such as in telephone, like mobile telephone systems, speech recognition 
systems, speaker authentication systems, speech coders and the like, provided with such an 
audio processing device. 



Such a method and devices are known from: 1997 IEEE International 
Conference on Acoustics, Speech, and Signal Processing, Vol I, April 21-24, 1997, 
"Strategies for Combining Acoustic Echo Cancellation and Adaptive Beamforming 
Microphone Arrays" by Walter Kellermann, pp 219-222, Munich, Germany. In particular a 
strategy is described, wherein a common beamforming method is decomposed into a 
time-invariant stage followed by a time-variant stage in order to avoid computational complexity 
and circumvent a time-variant beamforming in an acoustic echo cancelling device. As a 
consequence thereof the known strategy is restricted in its application possibilities. In 
addition it does not address the fundamental problem of combining the techniques of acoustic 
echo cancelling and adaptive beamforming, such that both can be applied simultaneously and 
independently from one another, irrespective of the different adaptation time scales involved. 


Therefore it is an object of the present invention to provide such a combined 
echo cancelling and adaptive beamforming method and device, wherein the distinct 
advantages of both techniques are retained, and wherein the necessary computations, despite 
the combination of techniques, are reduced to an acceptable level. 

Thereto the method according to the invention is characterized in that for each 
of the input signals an individual processing history of adaptive echo cancelling data is kept 
and combined with current adaptive beamforming data. 

Accordingly the communication device viz. the audio processing device 
according to the present invention is characterized in that the adaptive echo canceller is 
provided with storage means for storing in relation to every input signal, individual 
processing histories of adaptive echo cancelling data for combination with current adaptive 
beamforming data. 

It is an advantage of the method and device according to the present invention 
that by storing the processing history of the adaptive echo cancelling data of each input signal 
individually and by combining this data with current beamformer data, the combined use of 
these data yields an improved accuracy of the echo cancelling process. In particular updated 
coefficients of the beamformer, which change faster than the maximum tracking speed of the 
adaptive acoustic echo cancelling filter, are available for accurately calculating echo 
cancelling data for each individual input signal. The adaptive echo cancelling filter, which is 
usually very complex and may contain up to a few thousand coefficients, can now be 
implemented more easily, while the number of necessary calculations is reduced 
considerably. 

An embodiment of the method according to the invention is characterized in 
that the combined adaptive processing is devised such that each of the respective input 
signals runs through a parallel path containing an acoustic path and a beamformer path, 
whereafter signals in the parallel paths are summed and then processed. Accordingly the 
audio processing device is characterized in that the audio processing device is devised such 
that each of the respective input signals runs through a parallel path containing an 
acoustic path and a beamformer path, whereafter signals in the parallel paths are summed and 
then processed. Advantageously an adaptive echo canceller for performing the adaptive 
processing only needs to be connected at the summed end of the parallel paths and between 
the connections to and from the far end of a communication line. Advantageously no separate 
connections with the individual input paths are necessary, saving processor capacity. 

A further embodiment of the method according to the invention is 
characterized in that adaptive beamforming concerns filtering or weighting of the input 
signals. The audio processing device is characterized accordingly. 

When the adaptations made in the beamformer concern filtering, the input 
signals are filtered, for example with Finite Impulse Response (FIR) filters or Infinite 
Impulse Response (IIR) filters. In that case one speaks of a Filtered Sum Beamformer (FSB), 
whereas in a special case thereof, called a Weighted Sum Beamformer (WSB), the filters are 
replaced by real gains or attenuations. 
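
The relation between the two beamformer types can be illustrated with a minimal sketch. It assumes FIR filters only, and the function names are hypothetical illustrations rather than anything defined in this specification. The WSB is obtained from the FSB by giving every filter a single tap.

```python
def fir_filter(signal, taps):
    """Causal FIR filtering: y[n] = sum_j taps[j] * signal[n - j]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for j, tap in enumerate(taps):
            if n - j >= 0:
                acc += tap * signal[n - j]
        out.append(acc)
    return out


def filtered_sum_beamformer(inputs, filters):
    """FSB: filter each input with its own FIR filter, then sum the results."""
    filtered = [fir_filter(z, f) for z, f in zip(inputs, filters)]
    return [sum(samples) for samples in zip(*filtered)]


def weighted_sum_beamformer(inputs, gains):
    """WSB: special case of the FSB where each filter is a single real gain."""
    return filtered_sum_beamformer(inputs, [[g] for g in gains])


# Two input signals of three samples each (illustrative values only).
inputs = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
fsb_out = filtered_sum_beamformer(inputs, [[1.0, -1.0], [2.0]])
wsb_out = weighted_sum_beamformer(inputs, [0.5, 2.0])
```

In an adaptive beamformer the filters or gains passed in would change from block to block; the sketch only shows one fixed setting.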

A still further embodiment of the audio processing device is characterized in 
that the adaptive echo canceller comprises a Transform Domain Adaptive Filter, such as for 
example a Time Domain Adaptive Filter (TDAF), or a Frequency Domain Adaptive Filter 
(FDAF). Generally FDAFs are preferred for their lower computational complexity, and they 
show the additional advantage of a faster convergence if use is made of spectral 
normalization of the input. 

A preferred embodiment of the audio processing device according to the 
invention is characterized in that the adaptive echo canceller comprises a first section for 
calculating at least one loudspeaker input spectrum and a part of normalized update data, and 
a second section for performing convolution and calculating echo cancelling coefficient 
update data. In a particular embodiment, which saves a lot of computations specifically if the 
number of beamformer input signals grows, the second adaptive echo canceller section 
comprises an adaptive summing filter having an input for receiving beamformer filtering or 
weighting coefficients, the summing filter comprising the storage means for storing in 
relation to every input signal, individual processing histories of adaptive echo cancelling data 
for combination with current adaptive beamforming data. 


The method, audio processing device and communication device according 
to the invention will now be elucidated further, together with their additional advantages, 
with reference to the appended drawing, wherein similar components are 
referred to by means of the same reference numerals. In the drawing: 

Fig. 1 shows an embodiment of an audio processing device according to the 
invention equipped with an adaptive means for acoustic echo cancellation and an adaptive 
means for beamforming of multiple input signals; 

Fig. 2 shows a schematic representation of a preprocessor and a postprocessor 
part of a Frequency Domain Adaptive Filter (FDAF) implementing the acoustic echo 
canceller means for application in the audio processing device according to the invention; 
and 

Fig. 3 shows an adaptive scheme of an adaptive summing filter for application 
in the device of fig. 2, wherein echo cancelling filter coefficients are being stored and 
updated. 




Fig. 1 shows an audio processing device 1 in the form of a communication 
device comprising a connection to and from a far-end (not shown). The connection from the 
far-end receives a signal x(n) (n = ... -1, 0, 1, ..., n being the sampling index) for a loudspeaker 
2 of the device 1. The device 1 may contain more than one loudspeaker 2. The device 1 
further comprises a parallel arrangement of microphones 3-1, 3-2, ... 3-S providing S multiple 
input signals z_1(n), z_2(n), ... z_S(n). These input signals are fed to a beamformer 4. The 
beamformer 4 may have the form of a so called Filtered Sum Beamformer (FSB), then 
having filter impulse responses f_1, f_2, ... f_S, or have the form of a Weighted Sum Beamformer 
(WSB), which is an FSB whose filters are replaced by real gains or attenuations w_1, w_2, ... w_S. 
These responses and gains are continuously subjected to adaptations, that is changes in time. 
An adaptation control of the beamformer 4 controls this adaptation process. Such 
beamformer adaptations can for example be made for focussing on a different speaker 
location, such as known from EP-A-0954850. Adaptations can also be made in order to 
improve the overall signal-to-noise ratio. The adapted signals in the beamformer 4 are summed 
in an internal summing means 5 resulting in output signal z(n), and then fed to an external 
summing device 6. 

The audio processing device 1 further comprises an adaptive echo cancelling 
means or filter 7 coupled between the far-end connections for performing adaptive echo 
cancelling. Thereto the instantaneous or current filter responses or gains/attenuations from 
the adaptive beamformer 4 are fed to the adaptive echo canceller filter 7 for use thereby. Also 
the far-end input signal x(n) is fed to the filter 7. The filter 7 models respective acoustic paths 
having acoustic impulse responses h_1, h_2, ... h_S, while taking the current beamformer 
coefficients into account, such that an output signal y(n) of the filter 7 is made 
approximately equal to the echo component of the output signal z(n). The summing device 6 
subtracts y(n) from z(n) and thus provides an output signal to the far-end which is virtually 
free of acoustic echoes. The 
adaptive filter 7 performs a convolution between the signal x(n) and its modelled impulse 
response h to yield the desired signal y(n). Many algorithms are known in the 
literature for calculating and adaptively optimizing the filter coefficients h of the adaptive 
filter 7, which usually is very complicated due to the several thousands of coefficients 
necessary to implement the adaptive filter 7. The echo cancelling filter 7 can be implemented 
in any suitable domain, for example in the time domain or the frequency 
domain. If the device 1 contains more than one loudspeaker then a corresponding number of 
filters 7 is necessary for compensating that number of echoes. 
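
How a single filter can model S acoustic paths while taking the beamformer into account can be sketched in the time domain. The sketch below is an illustration under simplifying assumptions, not the claimed implementation: it assumes a WSB with real gains, known (not adapted) impulse responses of equal length, and hypothetical function names. The combined response is the gain-weighted sum of the individual responses, and its convolution with x(n) gives the echo estimate y(n).

```python
def convolve(x, h):
    """Plain linear convolution of two sample lists."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def combined_response(acoustic_paths, gains):
    """Loudspeaker-to-beamformer-output response for a WSB: sum_m w_m * h_m."""
    length = len(acoustic_paths[0])
    return [sum(w * h[n] for w, h in zip(gains, acoustic_paths))
            for n in range(length)]


def echo_residual(x, z, acoustic_paths, gains):
    """Subtract the modelled echo y(n) from the beamformer output z(n)."""
    h = combined_response(acoustic_paths, gains)
    y = convolve(x, h)[:len(z)]
    return [zn - yn for zn, yn in zip(z, y)]


# Illustrative values: far-end signal, two acoustic paths, two gains.
x = [1.0, 0.0, 2.0]
paths = [[0.5, 0.25], [0.0, 1.0]]
gains = [0.5, 0.5]

# Beamformer output containing only echo: weighted sum of the microphone echoes.
mic_echoes = [convolve(x, h)[:len(x)] for h in paths]
z = [sum(w * m[n] for w, m in zip(gains, mic_echoes)) for n in range(len(x))]

residual = echo_residual(x, z, paths, gains)
```

With a perfect model and no near-end speech the residual is zero; in practice the responses are unknown and have to be adapted, which is what the remainder of the description addresses.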

Fig. 2 shows a schematic representation of a preprocessor (upper) part and 
a postprocessor (lower) part of a Frequency Domain Adaptive Filter (FDAF) implementing 
the adaptive filter 7 of the audio processing device 1. In the preprocessor it is schematically 
shown that the loudspeaker signal x(n) is series-parallel converted (S/P) into blocks 
containing B samples. Next an array is formed consisting of these B samples preceded by M-B 
previous samples. Then a real Fast Fourier Transform (FFT) is performed on the last M 
samples of parallel data leading to the complex frequency spectrum of x(n) denoted by X. 
The preprocessor also comprises a normalizer calculating the complex conjugate spectrum of 
X, denoted by X*, to finally yield, in a way not elucidated further, the complex spectrum 
normalized by its input power spectrum P_xx. This particular algorithm thus normalized shows 
a convergence behavior which is independent of the input power. 

The postprocessor (lower) part multiplies the complex input spectrum X by the 
frequency domain FDAF coefficients H and performs an Inverse FFT. The first M-B samples 
of the result of the IFFT are discarded since these are polluted by cyclic convolution errors. 
The resulting B samples forming the signal y(n) are subtracted from the newest B samples 
forming the signal z(n) yielding B samples of a residual signal r(n) fed back to the 
postprocessor. After parallel series conversion (P/S) this signal r(n) is sent to the far-end. 
Next the fed back signal is preceded by appropriate zeros, transformed (FFT) to the 
frequency domain and multiplied by the normalized complex spectrum to give an update 
term for the FDAF coefficients. Finally the FDAF coefficients are updated with this update 
term in an update loop 8. The update loop 8 contains a constraint in the time domain, if no 
programmable filter is used. The constraint prevents cyclic convolution errors from occurring. 
Absence of the constraint saves an FFT and an IFFT for each update. See US 4 903 247, 
which is incorporated herein by reference. 
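
The block processing described for the filtering path (B new samples preceded by M-B previous samples, transform, multiplication by H, inverse transform, discarding the first M-B output samples) is the overlap-save scheme. The following sketch illustrates only that filtering path, with fixed coefficients and without normalization or coefficient updates. A naive DFT stands in for the real FFT to keep the example self-contained, and the function names are hypothetical.

```python
import cmath


def dft(seq, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def overlap_save(x, h, M, B):
    """Filter x with h in blocks of B samples; requires len(h) <= M - B + 1."""
    H = dft(list(h) + [0.0] * (M - len(h)))   # frequency domain coefficients
    history = [0.0] * (M - B)                 # the M-B previous samples
    y = []
    for start in range(0, len(x), B):
        block = list(x[start:start + B])
        block += [0.0] * (B - len(block))     # pad a short final block
        frame = history + block               # M samples in total
        X = dft(frame)
        result = dft([Xk * Hk for Xk, Hk in zip(X, H)], inverse=True)
        # The first M-B samples carry cyclic convolution errors: discard them.
        y.extend(v.real for v in result[M - B:])
        history = frame[B:]
    return y[:len(x)]


# Illustrative values: M = 4, B = 2 and a three-tap filter.
y = overlap_save([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 0.5, 0.25], M=4, B=2)
```

The retained B samples of each block agree with ordinary linear convolution as long as the filter is no longer than M-B+1 taps, which is why the discarded portion has exactly M-B samples.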

The update loop 8 contains a building block 9 in the form of an adaptive 
summing filter, which is elucidated further in fig. 3. The beamformer coefficients, that is to 
say the gains, or impulse responses, or their Fourier transforms: w_1, w_2, ... w_S, or f_1, f_2, ... f_S, 
or F_1, F_2, ... F_S respectively, that are constantly adapted by beamformer 4 are supplied to the 
adaptive filter 7, in particular to the building block 9 as shown in fig. 2. The building block 9 
contains S consecutive loops 10-1, ... 10-S such that for each of the S input signals an 
individual processing history of at least adaptive and/or updated adaptive echo cancelling 
coefficients is stored in delay elements z^-1 for use during adaptive echo cancelling processing. 
Summing devices 15 and 16 provide summed current frequency domain FDAF coefficients 
H(k;l_B), wherein k is the frequency band or bin, k=0 ... M-1, there being M frequency domain 
adaptive filter coefficients, and l_B is the iteration index, which is increased by unity once 
every B sampling instants. The FFT transformed and normalized residual signal r(n) updates 
the summed current coefficients in summing device Sm (see fig. 2) and provides new 
adaptive filter coefficients H(k;l_B+1) to summing device 11 for comparison with the above 
mentioned summed current FDAF coefficients. In each individual loop 10-1, ... 10-S the 
result of this comparison is multiplied in multipliers 12-1, ... 12-S by μ(k)F_1(k;l_B) ... 
μ(k)F_S(k;l_B), where μ is the stepsize. In summing devices 13-1, ... 13-S the multiplied results 
are accumulated for each separate signal. The results in the form of the updated data are 
stored for the next iteration in the storage/delay elements z^-1, then multiplied by F_1(k;l_B) 
... F_S(k;l_B) in multipliers 14-1, ... 14-S and summed in the above mentioned two summing 
devices 15 and 16 respectively. Summarizing, the combined 
beamformer and echo cancelling update model kept up to date in this case can be represented by: 

\[
H_m(k; l_B+1) = H_m(k; l_B) + \mu(k)\, F_m(k; l_B) \Bigl\{ H(k; l_B+1) - \sum_{p=1}^{S} F_p(k; l_B)\, H_p(k; l_B) \Bigr\}
\]

for m=1, ... S, where S represents the total number of beamformer inputs/microphones; 
k=0, 1, ... M-1, where there are M frequency domain adaptive filter coefficients at the l_B-th 
iteration; p represents the beamformer input concerned; and wherein further: 
l_B is the iteration index, which is increased by unity once every B sampling instants; 
H_m(k;l_B) is the k-th adaptive filter coefficient at the l_B-th iteration of the frequency 
domain transformed acoustic impulse response from the loudspeaker concerned to microphone m (or 
to beamformer input m); 
μ(k) is the stepsize (to be elucidated hereunder); 
F_m(k;l_B) is the frequency domain adaptive beamformer filter (gain/attenuation) coefficient of 
input m in the k-th frequency band, during the l_B-th iteration; 
H(k;l_B+1) is the updated frequency domain transformed impulse response summed over all 
inputs (from loudspeaker to beamformer output) in the k-th frequency band, during the l_B-th 
iteration. 
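
The update above can be sketched for a single frequency bin k as follows. This is a literal transcription of the equation as it appears in this text, with hypothetical names; the example values use real beamformer gains (the WSB case) and are purely illustrative.

```python
def update_bin(H_per_input, F, H_new, mu):
    """One iteration of the per-input coefficient update in one frequency bin.

    H_per_input -- stored coefficients H_m(k;l_B), one per beamformer input
    F           -- current beamformer coefficients F_m(k;l_B)
    H_new       -- updated summed response H(k;l_B+1)
    mu          -- stepsize mu(k)
    """
    # Summed current coefficients: sum_p F_p(k;l_B) * H_p(k;l_B).
    combined = sum(f * h for f, h in zip(F, H_per_input))
    # Comparison of the new summed response with the current summed one.
    error = H_new - combined
    # Each stored history is corrected in proportion to its own F_m.
    return [h + mu * f * error for h, f in zip(H_per_input, F)]


# Illustrative values for S = 2 inputs in one bin.
F = [0.6, 0.8]
H_old = [1 + 1j, 2 - 1j]
H_target = 3 + 0j
H_updated = update_bin(H_old, F, H_target, mu=1.0)
```

Only the S stored values per bin and the current beamformer coefficients are needed, so the echo canceller itself never has to be connected to the individual microphone signals.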

A good value for the stepsize in the case of an FSB is: 

\[
\mu(k) = \frac{1}{\sum_{m=1}^{S} \lvert F_m(k; l_B) \rvert^{2}}
\]

resulting in μ(k)=1 for all k, since the denominator (approximately) equals 1 in the case of 
an FSB according to EP-A-0954850. Similarly in a special case of FSB, that is WSB, one may 
choose: 

\[
\mu = \frac{1}{\sum_{m=1}^{S} w_m^{2}(l_B)}
\]

for the same reason resulting in μ=1. 
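
Why this stepsize is a natural choice can be checked in a few lines for the WSB case. The derivation below is a sketch that takes the update equation given earlier at face value and assumes real gains, so that F_m = w_m. Multiplying the update for input m by w_m and summing over all inputs gives:

```latex
\begin{aligned}
\sum_{m=1}^{S} w_m H_m(k; l_B+1)
  &= \sum_{m=1}^{S} w_m H_m(k; l_B)
   + \mu \Bigl( \sum_{m=1}^{S} w_m^{2} \Bigr)
     \Bigl\{ H(k; l_B+1) - \sum_{p=1}^{S} w_p H_p(k; l_B) \Bigr\} \\
  &= H(k; l_B+1)
  \qquad \text{when } \mu \sum_{m=1}^{S} w_m^{2} = 1 .
\end{aligned}
```

With this stepsize the stored per-input coefficients, recombined with the current gains, reproduce the newly updated summed response exactly after one iteration.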

Whilst the above has been described with reference to essentially preferred 
embodiments and best possible modes it will be understood that these embodiments are by no 
means to be construed as limiting examples of the devices concerned, because various 
modifications, features and combinations of features falling within the scope of the appended 
claims are now within reach of the skilled person. 

The above techniques may be combined with a technique implementing a 
plurality of loudspeakers, such that building block 9 is present as many times as there are 
loudspeakers. Stereo echo cancelling can also be applied. In addition a Dynamic Echo 
Suppressor (DES) may be coupled to the far-end output of the device 1 for providing 
additional echo suppression. 

CLAIMS:

1. A method wherein multiple input signals are subjected to a combination 
process of adaptive beamforming and adaptive echo cancelling, characterized in that for each 
of the input signals an individual processing history of adaptive echo cancelling data is kept 
and combined with current adaptive beamforming data. 

2. The method according to claim 1, characterized in that the combined adaptive 
processing is devised such that each of the respective input signals is running through a 
parallel path containing an acoustic path and a beamformer path, whereafter signals in the 
parallel paths are summed and processed. 

3. The method according to claim 1 or 2, characterized in that adaptive 
beamforming concerns filtering or weighting of the input signals. 



4. An audio processing device comprising at least one parallel acoustic path for 
providing respective input signals, the acoustic paths are connected in series to beamformer 
paths, the device comprises an adaptive beamformer and an adaptive echo canceller, 
characterized in that the adaptive echo canceller is provided with storage means for storing in 
relation to every input signal, individual processing histories of adaptive echo cancelling data 
for combination with current adaptive beamforming data. 


5. The audio processing device according to claim 4, characterized in that the 
audio processing device is devised such that each of the respective input signals is running 
through a parallel path containing an acoustic path and a beamformer path, whereafter signals 
in the parallel paths are summed and processed. 


6. The audio processing device according to claim 4 or 5, characterized in that 
the adaptive beamformer is a filtered and/or weighted beamformer. 

7. The audio processing device according to one of the claims 4-6, characterized 
in that the adaptive echo canceller comprises a Transform Domain Adaptive Filter, such as 
for example a Time Domain Adaptive Filter (TDAF), or a Frequency Domain Adaptive Filter 
(FDAF). 



8. The audio processing device according to one of the claims 4-7, characterized 
in that the adaptive filter comprises a first section for calculating at least one loudspeaker 
input spectrum and a part of normalized update data, and a second section for performing 
convolution and calculating echo cancelling coefficient update data. 

9. The audio processing device according to claim 8, characterized in that the 
second adaptive echo canceller section comprises an adaptive summing filter having an input 
for receiving beamformer filtering or weighting coefficients, the summing filter comprising 
the storage means for storing in relation to every input signal, individual processing histories 
of adaptive echo cancelling data for combination with current adaptive beamforming data. 



10. A communication device such as found in audio broadcast systems, audio 
and/or video conferencing systems, speech enhancement, such as in telephone, like mobile 
telephone systems, speech recognition systems, speaker authentication systems, speech 
coders and the like, the communication device having an audio processing device according 
to one of the claims 4-9, the audio processing device comprising at least one loudspeaker, 
multiple microphones for providing respective input signals, which microphones are 
coupled to the at least one loudspeaker through acoustic paths, an adaptive beamformer and 
an adaptive echo canceller, characterized in that the adaptive echo canceller is provided with 
storage means for storing in relation to every input signal an individual processing history of 
adaptive echo cancelling data for combination with current adaptive beamforming data. 

ABSTRACT: 

A method is described, wherein multiple input signals are subjected to a 
combination process of adaptive beamforming and adaptive echo cancelling, and wherein for 
each of the input signals an individual processing history of adaptive echo cancelling data is 
kept and combined with current adaptive beamforming data. Accordingly an audio 
processing device is described which comprises at least one parallel acoustic path for 
providing respective input signals, the acoustic paths being connected in series to beamformer 
paths, and the device comprises an adaptive beamformer and an adaptive echo canceller for 
performing adaptive beamforming and adaptive echo cancelling respectively, whereby the 
adaptive echo canceller is provided with storage means for storing in relation to every input 
signal, individual processing histories of adaptive echo cancelling data for combination with 
current adaptive beamforming data. Both beamformer and echo cancelling techniques can be 
combined such that a reduced number of calculations results. 